Thomson Reuters at TAC 2009: ContextChain and Fractional Conditional Compressibility of Models

Authors

  • Frank Schilder
  • Ravikumar Kondadadi
  • Sriharsha Veeramachaneni
Abstract

This paper reports the results of the FastSum system and a simple baseline system on the TAC 2009 main task, update summarization. For the pilot task of Automatically Evaluating Summaries of Peers (AESOP), we present two novel metrics. The first metric, ContextChain, extends the recently proposed AutoSummENG metric, which compares n-gram graphs of the model summaries with those of the automatically generated summaries. Our modification builds the n-gram graphs from co-reference chains extracted from the summaries: the n-gram graph is generated from the context information of the referents in these chains. Our second metric, Fractional Conditional Compressibility of Models (FraCC), is based on the Burrows-Wheeler compression algorithm. For this metric, we use an estimate of the conditional "compressibility" of the model summaries given the system summary. The conditional compressibility is defined as the increase in the compressibility of the model summary when the system summary has been observed. In addition to presenting these two new approaches to automatically evaluating summaries, we introduce two new evaluation measures for automatic metrics, Correlation Recall and Correlation Precision, and discuss how they shed more light on the coverage and correctness of evaluation metrics for summarization.
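
The idea of conditional compressibility can be illustrated with a small sketch. The abstract does not give the exact FraCC formula, so the normalization below is an assumption, as are the function names `compressed_size` and `fractional_conditional_compressibility`. Python's `bz2` module is used as a stand-in because bzip2 is built on the Burrows-Wheeler transform; the cost of the model summary given the system summary is approximated by how many extra bytes the model adds when compressed after the system summary.

```python
import bz2


def compressed_size(text: str) -> int:
    """Size in bytes of the bzip2-compressed UTF-8 encoding of text."""
    return len(bz2.compress(text.encode("utf-8")))


def fractional_conditional_compressibility(model: str, system: str) -> float:
    """Illustrative estimate only, not the paper's exact definition.

    Approximates the cost of the model summary given the system summary
    as C(system + model) - C(system), then reports the fraction of the
    model's stand-alone compressed size that is saved by having
    observed the system summary first.
    """
    c_model = compressed_size(model)
    c_system = compressed_size(system)
    c_joint = compressed_size(system + "\n" + model)
    conditional_cost = c_joint - c_system
    return (c_model - conditional_cost) / c_model
```

Under these assumptions, a system summary that shares much of its content with the model summary should score higher than an unrelated one. On very short texts the fixed overhead of the bzip2 container dominates, so the estimate is only meaningful for summaries of reasonable length.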

Similar resources

Benchmarks for Enterprise Linking: Thomson Reuters R&D at TAC 2013

This paper describes the TRRD systems entered in the TAC 2013 entity linking challenge. We explore a restricted version of the task that accesses only an entity authority file with (possibly noisy) alternative names and plain text from the target domain. This is designed to reflect the problem of linking to existing entity authorities within companies like Thomson Reuters. We used the 2013 shar...

Thomson Reuters at TAC 2008: Aggressive Filtering with FastSum for Update and Opinion Summarization

In TAC 2008 we participated in the main task (Update Summarization) as well as the Sentiment Summarization pilot task. We modified the FastSum system (Schilder and Kondadadi, 2008) and added more aggressive filtering in order to adapt the system to update summarization and sentiment summarization. For the Update Summarization task, we show that a classifier that identifies sentences that are si...

Information access in practice: case studies at Thomson Reuters

Isabelle Moulinier is a research scientist in the Thomson Reuters corporate R&D group. Since joining the group 15 years ago, her research interests have focused on the application of information retrieval, natural language processing and machine learning technologies to the improvement of search and other aspects of the user experience. Prior to joining Thomson Reuters, she worked on text categoriz...

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

This paper presents our extractive summarization system for the update summarization track of TAC 2009. The system is based on our newly developed document summarization framework under the theory of conditional information distance among many objects. The best summary is defined in this paper to be the one which has the minimum information distance to the entire document set. The best update ...

Publication date: 2009